Software Systems for Vision-Based Spatial Interaction
Authors
Abstract
The VICs project is exploring the development of modular, adaptable software systems for vision-based human-computer interaction. The key idea of this approach is to use local visual interaction cues (VICs) on a video stream shared between the user and the machine. A VIC consists of a graphical representation (e.g., an icon) superimposed on the video stream (and thus visible to the user), associated image processing algorithms for activating the cue, and other application-specific code. The video stream may be monocular or stereo, enabling 2-D and 3-D interaction, and may be combined with speech or haptics to provide enhanced interaction capabilities. VICs are intended for situations where large-scale spatial motion, particularly hand-eye coordinated motion, is essential. For example, large volumes (visualized graphically) can be manipulated by annotating the model with icons that can be grasped and released. Interaction with real physical systems is also possible. For example, a surgeon viewing a retinal image through a stereo microscope may lay out icons on the surface of the retina to mark areas of damage. These icons may be moved and manipulated using the gestural cues (now performed with surgical tools), and may ultimately be used to target therapeutic drugs or other interventions. VICs were designed around visual cues that are local in the visual field, to avoid the problem of general tracking of human motion. This approach strongly limits the image processing required, allows that processing to be driven dynamically by the interaction, and provides an easily parameterized, modular basis for software system development. Another important notion in VICs is that the user is intentionally attempting to interact with the system. To this end, training is performed to distinguish intentional gestures by the user (e.g., pressing a button) from unintentional motions (e.g., moving a hand over a button).
Finally, the idea of statically typed, time-invariant behavior specification is being explored as a software basis for VICs and their compositions. Canonical libraries of vision algorithms for different classes of VICs are being developed. To date, we have reported on two specific aspects of VICs. For systems that paint an interface on a surface, a recent report [1] details methods for fast, direct stereo for surface registration and tracking. For the specific case of a planar surface, we have shown that we can register the incoming streams through slew rates exceeding 1350 deg/s and translations of up to 3 m/s using less than 20% of a standard PC's processing capacity. Thus, we can maintain a consistent view during normal user head motion. Once the two incoming data streams are registered, a foreground/background segmentation is used to drive training and subsequent application of a hidden Markov model (HMM) for recognizing gestures. We have shown [2] that it is possible to train HMMs to detect button-press gestures with more than 98% accuracy and with extremely low (< 0.5%) false positive rates.
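The gesture-recognition idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each frame's foreground mask inside a cue's local region is reduced to one of three occupancy symbols, and that two small discrete HMMs (one for an intentional press, one for a hand merely passing over) are compared by likelihood. All names (`occupancy_symbol`, `DiscreteHMM`, `MODELS`), the thresholds, and the probabilities are hypothetical and hand-set for illustration, whereas the abstract describes HMMs obtained by training.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Mask = List[List[int]]          # binary foreground mask, 1 = foreground pixel
ROI = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive


def occupancy_symbol(mask: Mask, roi: ROI) -> int:
    """Quantize the foreground fraction inside the cue's local region.

    Returns 0 (empty), 1 (partly covered) or 2 (covered).
    The 0.2 / 0.7 thresholds are illustrative assumptions.
    """
    x0, y0, x1, y1 = roi
    pixels = [mask[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    fraction = sum(pixels) / len(pixels)
    if fraction < 0.2:
        return 0
    if fraction < 0.7:
        return 1
    return 2


@dataclass
class DiscreteHMM:
    start: List[float]        # start[i]    = P(first state is i)
    trans: List[List[float]]  # trans[i][j] = P(next state j | state i)
    emit: List[List[float]]   # emit[i][k]  = P(symbol k | state i)

    def log_likelihood(self, symbols: Sequence[int]) -> float:
        """Scaled forward algorithm for a non-empty symbol sequence."""
        n = len(self.start)
        alpha = [self.start[i] * self.emit[i][symbols[0]] for i in range(n)]
        log_p = 0.0
        for t in range(len(symbols)):
            if t > 0:
                alpha = [
                    sum(alpha[i] * self.trans[i][j] for i in range(n))
                    * self.emit[j][symbols[t]]
                    for j in range(n)
                ]
            scale = sum(alpha)
            if scale == 0.0:
                return float("-inf")
            alpha = [a / scale for a in alpha]  # rescale to avoid underflow
            log_p += math.log(scale)
        return log_p


# Hand-set (not trained) models; states are approach/dwell/withdraw
# for a press and enter/exit for a hand passing over the cue.
MODELS: Dict[str, DiscreteHMM] = {
    "press": DiscreteHMM(
        start=[1.0, 0.0, 0.0],
        trans=[[0.5, 0.5, 0.0], [0.0, 0.8, 0.2], [0.0, 0.0, 1.0]],
        emit=[[0.3, 0.6, 0.1], [0.05, 0.15, 0.8], [0.5, 0.4, 0.1]],
    ),
    "pass_over": DiscreteHMM(
        start=[1.0, 0.0],
        trans=[[0.5, 0.5], [0.0, 1.0]],
        emit=[[0.2, 0.5, 0.3], [0.5, 0.4, 0.1]],
    ),
}


def classify(symbols: Sequence[int], models: Dict[str, DiscreteHMM]) -> str:
    """Return the name of the model that best explains the sequence."""
    return max(models, key=lambda name: models[name].log_likelihood(symbols))
```

In use, a cue would call `occupancy_symbol` once per frame on its own region only, collect the symbols over a short window, and pass them to `classify`. Restricting the computation to the cue's region reflects the locality argument in the abstract: no global tracking of the hand is needed.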
Similar resources
Human Computer Interaction Using Vision-Based Hand Gesture Recognition
With the rapid emergence of 3D applications and virtual environments in computer systems, the need for a new type of interaction device arises. This is because traditional devices such as the mouse, keyboard, and joystick become inefficient and cumbersome within these virtual environments. In other words, the evolution of user interfaces shapes the change in Human-Computer Interaction (HCI). In...
Second-Order Statistical Texture Representation of Asphalt Pavement Distress Images Based on Local Binary Pattern in Spatial and Wavelet Domain
Assessment of pavement distresses is one of the important parts of pavement management systems to adopt the most effective road maintenance strategy. In the last decade, extensive studies have been done to develop automated systems for pavement distress processing based on machine vision techniques. One of the most important structural components of computer vision is the feature extraction met...
Comparison of Different Targets Used in Augmented Reality Applications in Ubiquitous GIS
Drilling requires accurate information about the locations of underground infrastructure; otherwise it can cause serious damage. Augmented Reality (AR), as a technology in Ubiquitous GIS (UBIGIS), can be used to visualize underground infrastructure on smartphones. Since smartphone sensors do not provide such accuracy, other approaches should be applied. Vision based computer vision systems are well kn...
Two New Methods of Boundary Correction for Classifying Textural Images
With the growth of technology, supervising systems are increasingly replacing humans in military, transportation, medical, spatial, and other industries. Among these systems are machine vision systems which are based on image processing and analysis. One of the important tasks of image processing is classification of images into desirable categories for the identification of objects or their sp...
Robot Motion Vision Part I: Theory
A direct method called fixation is introduced for solving the general motion vision problem, arbitrary motion relative to an arbitrary environment. This method results in a linear constraint equation which explicitly expresses the rotational velocity in terms of the translational velocity. The combination of this constraint equation with the Brightness-Change Constraint Equation solves the gene...